Treebanking by Sentence and Tree Transformation: Building a Treebank to support Question Answering in Portuguese

نویسندگان

  • Patrícia Nunes Gonçalves
  • Rita Santos
  • António Branco
چکیده

Abstract This paper presents CINTIL-QATreebank, a treebank composed of Portuguese sentences that can be used to support the development of Question Answering systems. To create this treebank, we use declarative sentences from the pre-existing CINTIL-Treebank and manually transform their syntactic structure into a non-declarative sentence. Our corpus includes two clause types: interrogative and imperative clauses. CINTIL-QATreebank can be used in language science and techology general research, but it was developed particularly for the development of automatic Question Answering systems. The non-declarative sentences are annotated with several layers of linguistic information, namely (i) trees with information on constituency and grammatical function; (ii) sentence type; (iii) interrogative pronoun; (iv) question type; and (v) semantic type of expected answer. Moreover, these non-declarative sentences are paired with their declarative counterparts and associated with the expected answer snippets.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Syntax To Semantics Transformation: Application To Treebanking

Mapping between syntax and semantics is one of the most promising research topics in corpus annotation. This paper deals with the implementation of an semi-automatic transformation from a syntactically-tagged corpus into a semantic-tagged one. The method has been experimentally applied to a 1600-sentence treebank (the UAM Spanish Treebank). Results of evaluation are provided as well as prospect...

متن کامل

Optimizing question answering systems by Accelerated Particle Swarm Optimization (APSO)

One of the most important research areas in natural language processing is Question Answering Systems (QASs). Existing search engines, with Google at the top, have many remarkable capabilities. But there is a basic limitation (search engines do not have deduction capability), a capability which a QAS is expected to have. In this perspective, a search engine may be viewed as a semi-mechanized QA...

متن کامل

Question Answering Based on Semantic Graphs

In this paper we present a question answering system supported by semantic graphs. Aside from providing answers to natural language questions, the system offers explanations for these answers via a visual representation of documents, their associated list of facts described by subject – verb – object triplets, and their summaries. The triplets, automatically extracted from the Penn Treebank par...

متن کامل

Syntactic Annotation in the Columbia Arabic Treebank

Abstract The Columbia Arabic Treebank (CATiB) is a database of syntactic analyses of Arabic sentences. CATiB contrasts with previous approaches to Arabic treebanking in its emphasis on faster production with some constraints on linguistic richness. Two basic ideas inspire the CATiB approach. First, CATiB avoids the annotation of redundant linguistic information that is determinable automaticall...

متن کامل

Treebank of Chinese Bible Translations

This paper reports on a treebanking project where eight different modern Chinese translations of the Bible are syntactically analyzed. The trees are created through dynamic treebanking which uses a parser to produce the trees. The trees have been going through manual checking, but corrections are made not by editing the tree files but by re-generating the trees with an updated grammar and dicti...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2012